{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[View the runnable example on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/inference/pytorch/accelerate_pytorch_inference_gpu.ipynb)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Accelerate PyTorch Inference using Intel ARC series dGPU"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can use Nano API to enable the Intel ARC series dGPU acceleration for PyTorch inference. It only takes a few lines."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To apply Intel ARC series dGPU acceleration, there're several steps for tools installation and environment preparation. "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Step 1, please refer to our [drive installation](https://dgpu-docs.intel.com/installation-guides/index.html#intel-arc-gpus) for general purpose GPU capabilities. "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Step 2, you also need to install OneAPI base tool kit [Download the Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html). OneMKL and DPC++ compiler are needed, others are optional."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Step 3, install proper BigDL-Nano for PyTorch inference using the following commands."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "shellscript"
    }
   },
   "outputs": [],
   "source": [
    "pip install --pre bigdl-nano[pytorch_20_xpu] -f https://developer.intel.com/ipex-whl-stable-xpu # prepare proper nano environment and its dependencies\n",
    "\n",
    "source bigdl-nano-init --gpu # enable nano environment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📝 **Note**\n",
    "> \n",
    "> Currently Intel ARC series dGPU acceleration for BigDL-Nano is only supported on Linux platform. "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take an [ResNet-50 model](https://pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html) pretrained on ImageNet dataset as an example. First, we load the model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from torchvision.models import resnet50\n",
    "\n",
    "original_model = resnet50(pretrained=True)\n",
    "original_model.eval()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To enable Intel ARC series dGPU acceleration for your PyTorch inference pipeline, **the major change you need to make is to import BigDL-Nano** `InferenceOptimizer`**, and trace your PyTorch model to convert it into an** `PytorchIPEXPUModel` **for inference by specifying device as \"GPU\"**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bigdl.nano.pytorch import InferenceOptimizer\n",
    "\n",
    "# default not use ipex optimization\n",
    "acc_model = InferenceOptimizer.trace(original_model, device=\"GPU\", use_ipex=False)\n",
    "\n",
    "# you can also choose to use ipex optimization\n",
    "acc_model = InferenceOptimizer.trace(original_model, device=\"GPU\", use_ipex=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📝 **Note**\n",
    "> \n",
    "> Please refer to [API documentation](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/pytorch.html#bigdl.nano.pytorch.InferenceOptimizer.trace) for more information on `InferenceOptimizer.trace`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Currently Intel ARC series dGPU acceleration also supports **fp16 precision** for your PyTorch inference pipeline. Only a few lines of changes are needed (see below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bigdl.nano.pytorch import InferenceOptimizer\n",
    "\n",
    "# default not use ipex optimization\n",
    "acc_model = InferenceOptimizer.quantize(original_model, device=\"GPU\", precision=\"fp16\", use_ipex=False)\n",
    "\n",
    "# you can also choose to use ipex optimization\n",
    "acc_model = InferenceOptimizer.quantize(original_model, device=\"GPU\", precision=\"fp16\", use_ipex=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📝 **Note**\n",
    "> \n",
    "> Please refer to [API documentation](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/pytorch.html#bigdl.nano.pytorch.InferenceOptimizer.quantize) for more information on `InferenceOptimizer.quantize`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You could then do the normal inference steps **under the context manager provided by Nano**, with the model accelerated by Intel ARC series dGPU:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "\n",
    "with InferenceOptimizer.get_context(acc_model):\n",
    "    data = torch.rand(2, 3, 224, 224)\n",
    "    predictions = acc_model(data)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📝 **Note**\n",
    "> \n",
    "> Importantly, ipex (intel-extension-for-pytorch) is needed to be installed no matter ipex optimization is used or not. \n",
    "> \n",
    "> And for all Nano optimized models by `InferenceOptimizer.trace` or `InferenceOptimizer.quantize`, you need to wrap the inference steps with an automatic context manager `InferenceOptimizer.get_context(model=...)` provided by Nano. You could refer to [here](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Inference/PyTorch/pytorch_context_manager.html) for more detailed usage of the context manager."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 📚 **Related Readings**\n",
    "> \n",
    "> - [How to install BigDL-Nano](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/install.html)\n",
    "> - [How to enable automatic context management for PyTorch inference on Nano optimized models](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Inference/PyTorch/pytorch_context_manager.html)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "nano-pytorch",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.7.15 (default, Nov 24 2022, 21:12:53) \n[GCC 11.2.0]"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "8a5edab282632443219e051e4ade2d1d5bbc671c781051bf1437897cbdfea0f1"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
